Recurrent Neural Networks with Word Embeddings

based on:

http://deeplearning.net/tutorial/rnnslu.html

This notebook takes the material from that page, downloads it, and adapts it to run under Python 3.

First, we delete any leftover files from a previous run:

In [16]:
!rm -rf is13 atis.pkl*

Now, we get the example code:

In [17]:
!git clone https://github.com/mesnilgr/is13.git
Cloning into 'is13'...
remote: Counting objects: 71, done.
remote: Total 71 (delta 0), reused 0 (delta 0), pack-reused 71
Unpacking objects: 100% (71/71), done.
Checking connectivity... done.

and get the sample data:

In [18]:
!curl -o atis.pkl.gz http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.pkl.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  214k  100  214k    0     0  1669k      0 --:--:-- --:--:-- --:--:-- 1673k
In [19]:
!gunzip atis.pkl.gz

Now, we convert the example Python 2 code to Python 3:

In [24]:
!2to3-3.4 -w is13
root: Generating grammar tables from /usr/lib/python3.4/lib2to3/PatternGrammar.txt
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: set_literal
RefactoringTool: Skipping implicit fixer: ws_comma
RefactoringTool: Refactored is13/data/load.py
RefactoringTool: Refactored is13/examples/elman-forward.py
RefactoringTool: Refactored is13/examples/jordan-forward.py
RefactoringTool: Refactored is13/metrics/accuracy.py
RefactoringTool: No changes to is13/rnn/elman.py
RefactoringTool: No changes to is13/rnn/jordan.py
RefactoringTool: Refactored is13/utils/tools.py
RefactoringTool: Files that were modified:
RefactoringTool: is13/data/load.py
RefactoringTool: is13/examples/elman-forward.py
RefactoringTool: is13/examples/jordan-forward.py
RefactoringTool: is13/metrics/accuracy.py
RefactoringTool: is13/rnn/elman.py
RefactoringTool: is13/rnn/jordan.py
RefactoringTool: is13/utils/tools.py
The automatic conversion misses a few things, so we patch them by hand. Python 2 pickles must be loaded with an explicit text encoding in Python 3:

In [62]:
!sed -i "s/load(f)/load(f, encoding=\"latin1\")/" is13/data/load.py

In Python 3, / is true division, so we switch the tools module to integer division (//) to keep indices integral:

In [5]:
!sed -i "s/\//\/\//g" is13/utils/tools.py

Finally, the conlleval subprocess exchanges bytes, so we encode its input and decode its output:

In [6]:
!sed -i "s/open(filename).read()/bytes(open(filename).read(), \"utf-8\")/" is13/metrics/accuracy.py
In [8]:
!sed -i "s/stdout.split/str(stdout, \"utf-8\").split/" is13/metrics/accuracy.py
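
Why encoding="latin1"? A Python 2 pickle stores text as raw bytes; Python 3's unpickler decodes those bytes as ASCII by default and fails on anything non-ASCII, whereas latin-1 maps every byte to a code point and never fails. A minimal illustration (the byte string below is hand-crafted to mimic what Python 2's `pickle.dumps('caf\xe9')` produces, not taken from atis.pkl):

```python
import pickle

# protocol-0 pickle of the Python 2 str 'caf\xe9' ("café" in latin-1)
blob = b"S'caf\\xe9'\n."

# the default ASCII decoding chokes on the 0xe9 byte
try:
    pickle.loads(blob)
except UnicodeDecodeError:
    print("ASCII decode failed")

# latin-1 accepts every byte value, so loading always succeeds
print(pickle.loads(blob, encoding="latin1"))
```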

Example

We import the libraries:

In [1]:
import numpy
import time
import sys
import subprocess
import os
import random

from is13.data import load
from is13.rnn.elman import model
from is13.metrics.accuracy import conlleval
from is13.utils.tools import shuffle, minibatch, contextwin
from functools import reduce
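
Two of these helpers do most of the sequence plumbing in the training loop below: contextwin slides a fixed-size window over a sentence of word indices (padding the edges with -1), and minibatch collects, for each time step, the most recent windows for truncated backpropagation through time. The sketches here are our own reimplementations of the idea, not the is13 library code:

```python
def contextwin(l, win):
    """Window of `win` word indices centred on each position, padded with -1."""
    assert win % 2 == 1, "window size must be odd"
    lpadded = win // 2 * [-1] + list(l) + win // 2 * [-1]
    return [lpadded[i:i + win] for i in range(len(l))]

def minibatch(l, bs):
    """For each time step, the (at most) `bs` most recent items ending there."""
    return [l[max(0, i - bs + 1):i + 1] for i in range(len(l))]

print(contextwin([0, 1, 2, 3], 3))
# [[-1, 0, 1], [0, 1, 2], [1, 2, 3], [2, 3, -1]]
print(minibatch(['a', 'b', 'c', 'd'], 2))
# [['a'], ['a', 'b'], ['b', 'c'], ['c', 'd']]
```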

and run an experiment:

In [2]:
s = {'fold':3, # 5 folds 0,1,2,3,4
     'lr':0.0627142536696559,
     'verbose':1,
     'decay':False, # decay on the learning rate if improvement stops
     'win':7, # number of words in the context window
     'bs':9, # number of backprop through time steps
     'nhidden':100, # number of hidden units
     'seed':345,
     'emb_dimension':100, # dimension of word embedding
     'nepochs':50}

folder = "is13/examples"
if not os.path.exists(folder): os.mkdir(folder)

# load the dataset
train_set, valid_set, test_set, dic = load.atisfold(s['fold'])
idx2label = dict((k,v) for v,k in dic['labels2idx'].items())
idx2word  = dict((k,v) for v,k in dic['words2idx'].items())

train_lex, train_ne, train_y = train_set
valid_lex, valid_ne, valid_y = valid_set
test_lex,  test_ne,  test_y  = test_set

vocsize = len(set(reduce(lambda x, y: list(x) + list(y),
                         train_lex + valid_lex + test_lex)))

nclasses = len(set(reduce(lambda x, y: list(x) + list(y),
                          train_y + test_y + valid_y)))

nsentences = len(train_lex)

# instantiate the model
numpy.random.seed(s['seed'])
random.seed(s['seed'])
rnn = model(    nh = s['nhidden'],
                nc = nclasses,
                ne = vocsize,
                de = s['emb_dimension'],
                cs = s['win'] )

# train with early stopping on validation set
best_f1 = -numpy.inf
s['clr'] = s['lr']
for e in range(s['nepochs']):
    # shuffle
    print("Epoch...", e)
    sys.stdout.flush()
    shuffle([train_lex, train_ne, train_y], s['seed'])
    s['ce'] = e
    tic = time.time()
    for i in range(nsentences):
        cwords = contextwin(train_lex[i], s['win'])
        words  = [numpy.asarray(x).astype('int32') for x in minibatch(cwords, s['bs'])]
        labels = train_y[i]
        for word_batch, label_last_word in zip(words, labels):
            rnn.train(word_batch, label_last_word, s['clr'])
            rnn.normalize()
        if s['verbose']:
            print('[learning] epoch %i >> %2.2f%%'%(e,(i+1)*100./nsentences),'completed in %.2f (sec) <<\r'%(time.time()-tic), end=' ')
            sys.stdout.flush()

    # evaluation // back into the real world : idx -> words
    predictions_test = [ [idx2label[x] for x in rnn.classify(numpy.asarray(contextwin(x, s['win'])).astype('int32'))]\
                         for x in test_lex ]
    groundtruth_test = [ [idx2label[x] for x in y] for y in test_y ]
    words_test = [ [idx2word[x] for x in w] for w in test_lex]

    predictions_valid = [ [idx2label[x] for x in rnn.classify(numpy.asarray(contextwin(x, s['win'])).astype('int32'))]\
                         for x in valid_lex ]
    groundtruth_valid = [ [idx2label[x] for x in y] for y in valid_y ]
    words_valid = [ [idx2word[x] for x in w] for w in valid_lex]

    # evaluation // compute the accuracy using conlleval.pl
    res_test  = conlleval(predictions_test, groundtruth_test, words_test, folder + '/current.test.txt')
    res_valid = conlleval(predictions_valid, groundtruth_valid, words_valid, folder + '/current.valid.txt')

    if res_valid['f1'] > best_f1:
        rnn.save(folder)
        best_f1 = res_valid['f1']
        if s['verbose']:
            print('NEW BEST: epoch', e, 'valid F1', res_valid['f1'], 'best test F1', res_test['f1'], ' '*20)
        s['vf1'], s['vp'], s['vr'] = res_valid['f1'], res_valid['p'], res_valid['r']
        s['tf1'], s['tp'], s['tr'] = res_test['f1'],  res_test['p'],  res_test['r']
        s['be'] = e
        subprocess.call(['mv', folder + '/current.test.txt', folder + '/best.test.txt'])
        subprocess.call(['mv', folder + '/current.valid.txt', folder + '/best.valid.txt'])
    else:
        print('')

    # learning rate decay if no improvement in 10 epochs
    if s['decay'] and abs(s['be']-s['ce']) >= 10: s['clr'] *= 0.5 
    if s['clr'] < 1e-5: break

print('BEST RESULT: epoch', e, 'valid F1', s['vf1'], 'best test F1', s['tf1'], 'with the model', folder)
Epoch... 0
 NEW BEST: epoch 0 valid F1 86.65 best test F1 83.13                     
Epoch... 1
 NEW BEST: epoch 1 valid F1 89.99 best test F1 87.25                     
Epoch... 2
 NEW BEST: epoch 2 valid F1 92.8 best test F1 89.52                     
Epoch... 3
 
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-2-fdc358eb1f6e> in <module>()
     52         labels = train_y[i]
     53         for word_batch , label_last_word in zip(words, labels):
---> 54             rnn.train(word_batch, label_last_word, s['clr'])
     55             rnn.normalize()
     56         if s['verbose']:

/usr/local/lib/python3.4/dist-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    593         t0_fn = time.time()
    594         try:
--> 595             outputs = self.fn()
    596         except Exception:
    597             if hasattr(self.fn, 'position_of_error'):

/usr/local/lib/python3.4/dist-packages/theano/scan_module/scan_op.py in rval(p, i, o, n, allow_gc)
    670         def rval(p=p, i=node_input_storage, o=node_output_storage, n=node,
    671                  allow_gc=allow_gc):
--> 672             r = p(n, [x[0] for x in i], o)
    673             for o in node.outputs:
    674                 compute_map[o][0] = True

/usr/local/lib/python3.4/dist-packages/theano/scan_module/scan_op.py in <lambda>(node, args, outs)
    659                         args,
    660                         outs,
--> 661                         self, node)
    662         except (ImportError, theano.gof.cmodule.MissingGXX):
    663             p = self.execute

scan_perform.pyx in theano.scan_module.scan_perform.perform (/home/dblank/.theano/compiledir_Linux-3.13--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-3.4.0-64/scan_perform/mod.cpp:3537)()

/usr/local/lib/python3.4/dist-packages/theano/gof/op.py in rval(p, i, o, n)
    766             # default arguments are stored in the closure of `rval`
    767             def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
--> 768                 r = p(n, [x[0] for x in i], o)
    769                 for o in node.outputs:
    770                     compute_map[o][0] = True

/usr/local/lib/python3.4/dist-packages/theano/tensor/blas.py in perform(self, node, inputs, out_storage)
    396             #                         overwrite_y=self.inplace)
    397             out_storage[0][0] = gemv(alpha, A.T, x, beta, y,
--> 398                                      overwrite_y=self.inplace, trans=True)
    399         else:
    400             out = numpy.dot(A, x)

KeyboardInterrupt:
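
The run above was stopped by hand after epoch 3 (hence the KeyboardInterrupt); the best model found so far had already been saved to the examples folder. Had it continued with decay enabled, the loop would halve the learning rate whenever the best validation epoch is 10 or more epochs old, and stop once the rate drops below 1e-5. A minimal sketch of that rule (the function name and signature are ours, not from is13):

```python
def update_lr(lr, best_epoch, current_epoch, decay=True, patience=10, factor=0.5):
    """Halve the learning rate when the best epoch is `patience`+ epochs old."""
    if decay and abs(best_epoch - current_epoch) >= patience:
        lr *= factor
    return lr

# improvement was recent: rate unchanged
print(update_lr(0.06, best_epoch=3, current_epoch=5))   # 0.06
# no improvement for 10 epochs: rate halved
print(update_lr(0.06, best_epoch=0, current_epoch=10))  # 0.03
```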